Inference of variable-length acoustic units for continuous speech recognition
Abstract
In the field of speech recognition, the patterns assumed to structure the speech material (phonemes, triphones, words...) are defined a priori according to a linguistic criterion, whereas the recognition criterion is based on an acoustic similarity measure. This mismatch may result in a lack of consistency for the recognition units. In this paper, we explore the possibility of a more data-driven approach, where recognition units are derived according to an acoustic criterion and then mapped to variable-length sequences of phonemes in an unsupervised way. Continuous speech recognition experiments are reported to evaluate the consistency of those units as opposed to linguistically defined units.
Similar resources
Inference of variable-length linguistic and acoustic units by multigrams
The efficiency of pattern recognition algorithms is highly conditioned to a proper definition of the patterns assumed to structure the data. The multigram model provides a statistical tool to retrieve sequential variable-length regularities within streams of data. In this paper, we present a general formulation of the model, applicable to single or multiple parallel strings of data having eithe...
Variable-length acoustic units inference for text-to-speech synthesis
The best voices in text-to-speech synthesis are currently obtained via acoustic units concatenation-based systems. In such systems, the choice of units whose concatenations will produce an acoustic message is a crucial stage. Moreover, it can be observed that current TTS systems use acoustic units which most often correspond to variable-length phonetic descriptions. In this article, an original...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data and browsing and retrieving this archive is one of the key issues for this broadcasting...
LONGER-LENGTH ACOUSTIC UNITS FOR CONTINUOUS SPEECH RECOGNITION (ThuAmPO1)
Recent research on the TIMIT database suggests that longer-length acoustic units are better suited for modelling pronunciation variation and long-term temporal dependencies in speech than traditional phoneme-length units, yielding substantial improvements in recognition accuracy [9]. In this paper, we investigate whether similar improvements can be gained on another database, viz. excerpts from...
Split-lexicon based hierarchical recognition of speech using syllable and word level acoustic units
Most speech recognition systems, especially LVCSR, use context-dependent phones as the basic acoustic unit for recognition. The primary motive for this is the relative ease with which phone-based systems can be trained robustly with small amounts of data. However, as recent research indicates, significant improvements in recognition accuracy can be gained by using acoustic units of longer durati...